ABSTRACT

Results are presented of an investigation made to (1) 
provide a description of the testing models that are currently being 
used in selected individualized instructional programs, (2) compare 
three programs along the component parts of the testing model, 
namely, selection of a program of study, criterion-referenced testing 
on the unit objectives, assignment of instructional modes, and final 
year-end assessment, and (3) briefly outline several promising lines 
of research in connection with the testing methods and decision 
procedures for individualized instructional programs. The three
programs selected for study were: Individually Prescribed
Instruction, Program for Learning in Accordance with Needs, and
Mastery Learning. An introduction, which includes a brief history,
the content areas covered, and an indication of the extent of 
implementation, is provided for each instructional model. In 
addition, a description of each instructional paradigm and details on 
the testing model are provided. An attempt is made to pinpoint the 
decision points in each model, spelling out the consequences of the 
various possible actions in relation to each of the "possible true
states of nature." A lengthy list of references is included. (DB)
1.1 Background

While the idea of developing instructional programs in our schools
to meet individual student needs is not a new theme in American education
(see, for example, Washburne, 1922; and Wilhelms, 1962), it has only
been in the last decade that such programs have been implemented on any
large-scale basis in the schools.

The basic argument in favor of individualizing instruction comes from
a multitude of research studies that suggest that students differ in
interests, motivation, learning rate, goals, and capacity for learning,
among other things; and, therefore, group-based instruction on a common
curriculum is inappropriate to meet their educational needs. That change
in our schools is necessary is obvious when one notes that schools provide
successful learning experiences for only about one-third of our students
(Block, 1971). On the basis of Project TALENT data, Flanagan et al. (1964)
reported that our current instructional programs are inadequate to handle the
large individual differences in any age or grade group. In addition, schools
generally fail to help the student develop a sense of responsibility for his
educational, personal, and social development or to make realistic educational
decisions and choices about his future.

This trend toward individualization of instruction in education has
resulted in the development of a diverse collection of attractive alternative
models (see, for example, Gibbons, 1970; and Heathers, 1972) that, according
to their supporters, offer new approaches to student learning which can
provide almost all students with rewarding school experiences. These
include: Individually Prescribed Instruction (IPI) (Glaser, 1968, 1970),
Program for Learning in Accordance with Needs (PLAN) (Flanagan, 1967,
1969), Computer-Assisted Instruction (CAI) (Suppes, 1966; Atkinson, 1968;
Atkinson and Wilson, 1969), Individualized Mathematics Curriculum Project
(De Vault, Kriewall, Buchanan, and Quilling, 1969), and Mastery Learning
(Carroll, 1963, 1970; Bloom, 1968; and Block, 1971). All of the models,
as well as many others, represent significant steps forward in improving 
learning by individualizing instruction. They strive to actively involve 
the student in the learning process, allow students in the same class 
to be at different points in the curriculum, and permit the teacher to 
give more individual attention. 

In matters pertaining to these models, for example, the construction
of instructional materials (Popham, 1969; Smith, 1969), curriculum design
(Wittrock and Wiley, 1970), and computer management (Baker, 1971; Cooley
and Glaser, 1969), there is a substantial body of knowledge. It is
perhaps surprising to note, then, that the amount of information currently
available on the testing methods and decision procedures for these
programs is quite limited. It is this component that, in principle,
facilitates the efficient movement of students through the instructional
program.

One reason for the lack of information is that measurement requirements
within the context of many of the new programs require new kinds of tests.
These are the criterion-referenced tests, which are constructed and
interpreted in ways quite different from the norm-referenced tests which are
more familiar to most practitioners in the field (Popham and Husek, 1969;
Glaser and Nitko, 1971; Hambleton and Novick, 1973).

Since one of the major purposes of individualized programs is to 
maximize the opportunity for all students to learn, it follows that tests
used to monitor student progress should be keyed to the instruction. 
Further, they should provide information that can be used to measure 
progress along an absolute ability continuum. Norm-referenced tests are 
constructed specifically to facilitate making comparisons among students; 
hence, they are not very well suited for making most of the decisions 
required in individualized instructional programs. 

1.2 Criterion-Referenced Testing and Measurement 

Much of the discussion in the area of criterion-referenced testing
and measurement (for example, see Block, 1971; Ebel, 1971; Glaser and
Nitko, 1971; and Hambleton and Novick, 1973) stems from different
understandings as to the basic purpose of testing in the instructional
models described in the previous section. It would seem that in most
cases the pertinent question is whether or not the individual has attained
some prescribed degree of competence on an instructional performance task.
Questions of precise achievement levels and comparisons among individuals
on these levels seem to be largely irrelevant. In many of the new
instructional models, tests are used to determine on which instructional
objectives an examinee has met the acceptable performance level standard
set by the model designer. This test information is usually used
immediately to evaluate the student's mastery of the instructional
objectives covered in the test, so as to appropriately locate him for
his next instruction (Glaser and Nitko, 1971). Tests especially designed
for this particular purpose have come to be known as criterion-referenced
tests. Criterion-referenced tests are specifically designed to meet
the measurement needs of the new instructional models. In contrast, the
better known norm-referenced tests are principally designed to produce
test scores suitable for ranking individuals on the ability measured by
the test. A very flexible definition of a criterion-referenced test has
been proposed by Glaser and Nitko (1971): "...[a test] that is deliberately
constructed so as to yield measurements that are directly interpretable
in terms of specified performance standards." According to Glaser and
Nitko (1971), "The performance standards are usually specified by defining
some domain of tasks that the student should perform. Representative
samples of tasks from this domain are organized into a test. Measurements
are taken and are used to make a statement about the performance of each
individual relative to that domain." Distinctions between norm-referenced
tests and criterion-referenced tests have been presented by
Glaser (1963), Glaser and Nitko (1971), Livingston (1972), Popham and
Husek (1969), Ebel (1971), Block (1971), Hambleton and Gorth (1971),
and Hieronymus (1972).

Hambleton and Novick (1973) have discussed the evaluation of criterion-referenced
tests in practical situations. In their formulation, reliability
takes the form of an index indicating the consistency of decision making
across parallel forms of the criterion-referenced test or across repeated
measurements. Validity takes the same form except, of course, that a new
test serves as criterion. Both reliability and validity concepts are
reformulated in straightforward decision-theoretic terms. However, at
this stage of the development of a theory of criterion-referenced
measurement, the establishment of cut-off scores is primarily a value
judgment. [Further clarification is provided by Hambleton and Novick
(1973) and Block (1972).]
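
The idea of reliability as consistency of decision making can be illustrated with a short sketch. It assumes the simplest possible index, the proportion of examinees who receive the same mastery decision on two parallel forms; this is an illustrative choice and not necessarily the exact index of Hambleton and Novick (1973). The function names, the scores, and the 85% cut-off used in the example are all hypothetical.

```python
from typing import Sequence


def is_master(score: float, cutoff: float) -> bool:
    """Classify an examinee as a master when the score reaches the cut-off."""
    return score >= cutoff


def decision_consistency(form_a: Sequence[float],
                         form_b: Sequence[float],
                         cutoff: float) -> float:
    """Proportion of examinees given the same mastery decision on both forms."""
    if len(form_a) != len(form_b) or not form_a:
        raise ValueError("need two equally long, non-empty score lists")
    agreements = sum(
        is_master(a, cutoff) == is_master(b, cutoff)
        for a, b in zip(form_a, form_b)
    )
    return agreements / len(form_a)


# Hypothetical percent-correct scores for five examinees on two parallel forms.
scores_form_a = [90, 70, 85, 40, 88]
scores_form_b = [95, 86, 80, 35, 91]
print(decision_consistency(scores_form_a, scores_form_b, cutoff=85))
```

Under the same assumption, a validity index would be obtained by replacing the second form with scores on a separate criterion test. Because the index depends on the cut-off, a different value judgment about the cut-off score can change the apparent consistency of the same two forms.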

1.3 Instructional Models Under Consideration 

The major concern in this paper is with instructional models that
include a specification of curricula in terms of behavioral objectives,
detailed diagnosis of the entering competencies of students, the availability
of multiple instructional resources, individual pacing and
sequencing of material, as well as the careful monitoring of student
progress.

In the programs under consideration, Computer-Managed Instruction
(CMI) is an optional feature. Under CMI the goal is for the computer to
service classroom terminals which assist the classroom teacher in assessing
a student's strengths and weaknesses, and to prescribe instructional
sequences (Cooley and Glaser, 1969). Project PLAN and CAI are implemented
in a CMI mode whereas IPI and Mastery Learning are not.

In summary, the goals of individualized instructional programs
developed along the general lines of the specifications above are to
enable students to work through the units of instruction at a pace
reasonable for them, to develop self-direction and self-initiation, to
encourage self-evaluation as well as motivation for learning, and to
demonstrate mastery in a variety of skills.
Cronbach (1967) reported on three major patterns of dealing with
individual differences which provide a framework for the models considered
in this paper. Patterns of dealing with individual differences in the
school can be described in terms of the extent to which educational goals
and instructional methods are varied. In one pattern, the educational
goals and instructional methods are relatively fixed and inflexible.
Individual differences are handled mainly by dropping students from the
program when they begin to encounter difficulty. In a second pattern,
goals are selected for students on the basis of interest and potential.
They are then channeled into one fixed program or another. Individual
differences are handled by providing multiple optional programs. The
models we describe in this paper fit into a third pattern where goals and
instructional resources are individualized for the purpose of maximizing
learning.

1.4 Purposes of the Investigation 

The success of individualization depends to a considerable extent on 
how effectively teachers and students make decisions as to the mastery of 
specific instructional objectives, the development of individual 
prescriptions, the selection of instructional resources, etc. However, 
various writers including Baker (1971) and Glaser and Nitko (1971) have 
commented rather critically on existing testing techniques and procedures. 
Relevant background for improving such a situation would certainly include 
a review of the testing models of some of the more commonly used 
individualized instructional programs. Such a review would assist in 
defining the kinds of decisions that are made, and the information on 
which the decisions are based. This should provide a basis for developing
testing methods and decision procedures specifically designed for use
within the context of these models. (Although it would be ideal to
develop a general measurement model to cover all the instructional 
models, we are not prepared in this paper to advance such a model.) 

The first purpose of the investigation was to provide a description
of the testing models that are currently being used in selected
individualized instructional programs. Three programs were selected
for study: Individually Prescribed Instruction, Program for Learning in
Accordance with Needs, and Mastery Learning. [These models as well as
others are also discussed by Baker (1971); however, he was concerned
with their computer-based instructional management systems, which are of
only secondary interest in this paper.] These programs were selected
for this study because they are among the best known, and because there
is a substantial amount of information available on each. In the
following sections, an introduction is provided for each instructional
model. The introduction includes a brief history, the content areas
covered, and an indication of the extent of implementation. Also, a
description of each instructional paradigm and details on the testing
model are provided. An attempt is made to pinpoint the decision points
in each model, spelling out the consequences of the various possible
actions in relation to each of the "possible true states of nature."

The discussion of the models is based on descriptions found in
books, papers, and reports; on-site visits; and meetings with many of
the developers. It should be noted, however, that programs are often
implemented by teachers quite differently than they are reported in
the literature. Also, it should be remembered that these programs are
constantly changing; hence, it is possible that certain features of
the models are not exactly as they are described here. In particular,
it is our impression that PLAN is being implemented in a way quite
different from what has been written about it. This is because Westinghouse
Learning Corporation has now taken over the development and implementation
components.

A second purpose was to compare the three programs along the component 
parts of the testing model; namely, selection of a program of study, 
criterion-referenced testing on the unit objectives, assignment of 
instructional modes, and final year-end assessment. 

A final purpose was to briefly outline several promising lines of
research in connection with the testing methods and decision procedures 
for individualized instructional programs. 
II. Individually Prescribed Instruction (IPI)

2.1 Background 

The Learning Research and Development Center (LRDC) at the University
of Pittsburgh initiated the Individually Prescribed Instruction Project
during the early 1960's at the Oakleaf School in cooperation with the
Baldwin-Whitehall Public School District near Pittsburgh. Major contributors
to the project over the years include Robert Glaser, John Bolvin, M.
Lindvall, and Richard Cox. Initial activities concentrated on producing
instructional materials and training materials. More recently, research
and evaluation activities have assumed an increasingly important role in
Center activities.

As of 1972 the IPI program was being implemented in over 250 schools 
around the country. Distribution of materials and other information on the
program is managed by Research for Better Schools, Inc., a United States 
Office of Education Regional Laboratory located in Philadelphia. At 
present, instructional materials are available in elementary mathematics, 
reading, science, handwriting, and spelling. 

2.2 Description of the Instructional Paradigm 

While we will discuss the instructional paradigm and the corresponding
test model in the context of the IPI mathematics program, the
procedures, techniques, etc., described are in no way limited to that content.
In fact, it should be noted that the mathematics program as implemented is presently
somewhat different from what we describe here, since the LRDC is constantly
refining and improving the program (Lindvall, personal communication).
Fortunately, for our purposes the basic structure of the program remains
as described.

It is instructive first of all to describe the structure of the
mathematics curriculum. Cooley and Glaser (1969) report that the mathematics
curriculum consists of 430 specified instructional objectives.
These objectives are grouped into 88 units. (In the 1972 version of the
program there were 359 objectives organized into 71 units.) Each unit is
an instructional entity which the student works through at any one time.
There are 5 objectives per unit, on the average, the range being 1 to 14.
A collection of units covering different subject areas in mathematics comprises
a level; the levels may be thought of as roughly comparable to school
grades. For illustrative purposes, Table 2.2.1 presents the number of
objectives for each unit in the IPI mathematics curriculum.

The teacher is faced with the problem of locating, for each student,
that point in the curriculum where he can most profitably begin instruction.
Also, he is responsible for the continuous diagnosis needed to ensure that
the pupil demonstrates proficiency in each skill prescribed in his particular
instructional sequence as he moves along.

At the beginning of each school year the teacher places the student
within the curriculum; that is, he identifies the units in each content
area for which instruction is required. After completing the gross placement,
a single unit is selected as the starting point for instruction, and
a diagnostic instrument is administered to assess the student's competencies
on objectives within the unit. The outcome of the unit test is information
appropriate for prescribing instruction on each objective in the unit.
In addition, it is also necessary to select the particular set of resources
for the student. In theory, resources that match the individual's "learning
style" are selected. Within each unit, there are short tests to
monitor the student's progress. Finally, upon completion of initial
instruction in each unit, assessment and diagnostic testing takes place.
In the next section, we review the tests and the mechanisms for making these
decisions. Suffice it to say here that it has been found that teachers differ
in the extent to which they follow prescription decision-making rules (Lindvall,
Cox, and Bolvin, 1970).

Table 2.2.1

Number of Objectives for Each Unit in the
IPI Mathematics Curriculum

                                            Levels
Content Area                  A    B    C    D    E    F    G    H

Numeration                   12   10    8    8    8    3    8    4
Place Value                   -    3    5   10    7    5    2    1
Addition                      3   10    5    8    6    2    3    2
Subtraction                   -    -    4    6    3    1    3    1
Multiplication                -    -    -    8   11   10    6    3
Division                      -    -    -    7    7    9    5    5
Combination of Processes      -    -    6    5    7    4    5    6
Fractions                     3    2    4    6    6   14    5    2
Money                         -    4    4    6    4    1    -    -
Time                          -    3    2    7    9    5    3    1
Systems of Measurement        -    4    3    5    7    3    2    -
Geometry                      -    2    2    3    9   10    7    9
Special Topics                -    -    1    3    3    5    4    5

Reproduced, by permission, from Lindvall, Cox, and Bolvin (1970).

2.3 Details of the Testing Model 

Various reports over the last couple of years have dealt with the
testing model and its development (Lindvall, Cox, and Bolvin, 1970; Glaser
and Nitko, 1971; Cox and Boston, 1967). A flow chart of the testing model
is presented in Figure 2.3.1. To monitor a student through the program
the following tests are used: placement tests, unit pretests, unit posttests,
and curriculum-embedded tests. All of the tests are criterion-referenced,
with performance on the tests compared to performance standards
for decision-making.

How sophisticated is the decision-making process utilizing the scores
from the various tests? According to Glaser (1968):

At the present stage of our knowledge, the decision rules
for going from measures of student performance to instructional
prescriptions may not be very complex, but little
is known about the amount of complexity required, although
the individual monitoring of student performance provides
us with a good data base to study this process.

Promising developments in the last couple of years include increased
knowledge about constructing and evaluating criterion-referenced tests. Also,
the research on branched testing strategies (Ferguson, 1969, 1971) has much
potential for improving the efficiency of the testing model. This second
point will be discussed in greater detail in a later section.

Placement Tests

When a new student enters the program, it is necessary to place the
student at the appropriate level of instruction in each of the content areas.
[Glaser and Nitko (1971) called this stage-one placement testing.]
Typically, this is done by administering a placement test which covers
all of the subject areas at a particular level (see Table 2.2.1). Factors affecting
the selection of a level for placement testing of a student include
student age, past performance, and teacher judgment. Generally, the placement
test covers the most difficult or most characteristic objectives within each
area. Placement tests are administered until a unit profile identifying a
student's competencies within each area is complete. At present, the somewhat
arbitrary 80-85% proficiency level is used for most tests in the IPI system.

[Figure 2.3.1. Flowchart of steps in monitoring student progress in the
IPI program. (Reproduced, by permission, from Lindvall and Cox, 1969.)
Steps shown: placement test taken; one specific unit selected for study;
unit pretest taken; prescription developed for one skill in unit; student
works on instructional materials for one skill; CET for skill taken (with
branches for failing the CET and for passing the CET for the last
unmastered skill); unit posttest taken.]

Scores for a student on items measuring objectives in each unit and area
in the placement test are used to define an individual program for him. The
standard procedure is to assign instruction on units in which placement test
performance on items measuring a few representative objectives in the units
is between 20% and 80%. If the score is less than 20% for a given unit,
the unit test in the area at the next lowest level is administered and the
same criterion is applied. If he passes the unit test, he receives instruction
in the unit in the next level. In the case where a student has a score
of 80% or over, he is tested on the unit in the area at the next highest
level. [Further information is provided by Lindvall, Cox, and Bolvin (1970),
Weisgerber (1971), and Cox and Boston (1967).]

For example, suppose a student were to achieve scores on level E of
60%, 90%, 60%, 60%, 30%, 30%, 25%, 90%, 50%, 10%, 0%, 30%, 30% in the thirteen
areas indicated in Table 2.2.1. It is likely that he would be prescribed
instruction at level E in the areas of numeration, addition, subtraction,
multiplication, division, combination of processes, money, geometry, and
special topics. He would receive the level F placement tests in place
value and fractions. If, for example, he scores 60% and 10% respectively,
he would receive instruction at level F in place value and probably level E in
fractions. He would also be administered the level D placement tests in
the areas of time and systems of measurement. If, for example, his scores
were 0% and 40%, he would receive a still lower placement test in the area
of time and would be prescribed instruction at level D in systems of
measurement. If he scores 85% on the level C placement test in the area
of time, he would be assigned to level D for instruction.
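
The first pass of the placement procedure can be sketched in a few lines. The sketch assumes that the 20% and 80% boundaries are applied exactly as stated above (a score of exactly 20% leads to instruction, a score of exactly 80% leads to testing at the next higher level); the function name and the action labels are hypothetical, and the follow-up decisions described in the example, which involve some judgment, are not modeled.

```python
# Content areas in the order used in Table 2.2.1.
AREAS = [
    "numeration", "place value", "addition", "subtraction",
    "multiplication", "division", "combination of processes", "fractions",
    "money", "time", "systems of measurement", "geometry", "special topics",
]


def placement_action(score: float) -> str:
    """Map a percent score on a placement test to the next step."""
    if score < 20:
        return "test next lower level"
    if score < 80:
        return "instruct at this level"
    return "test next higher level"


# Level E scores from the worked example in the text.
level_e_scores = [60, 90, 60, 60, 30, 30, 25, 90, 50, 10, 0, 30, 30]
actions = {area: placement_action(s) for area, s in zip(AREAS, level_e_scores)}

for area, action in actions.items():
    print(f"{area}: {action}")
```

Applied to the level E scores, the rule assigns instruction at level E in nine areas, sends place value and fractions to the level F tests, and sends time and systems of measurement to the level D tests, which agrees with the example. The later steps (for instance, settling on level E for fractions after a 10% score at level F) would need additional rules or teacher judgment.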

In order to acquire some information on the average length of the tests,
the level E placement tests of the 1972 edition of the IPI program were selected
and examined. Analysis revealed that on the average there are 12 items
measuring the objectives in each area (with a range of from six to 20).

In summary, we note that the placement test has the following characteristics:
it provides a gross level of achievement for any student in the
curriculum, and provides information for proper placement of students in
the curriculum.

Unit Pretests and Posttests 

Having received an initial prescription of units, a student proceeds
by taking a pretest for a unit at the lowest level of mastery on his profile.
[Glaser and Nitko (1971) call this stage-two placement testing.] A unit
pretest includes one or more items to measure each objective in the unit.
A review of the unit pretests and posttests in level F revealed that the
approximate number of items on a test is 37 (the range is from 21 to 64) and the
average number of items measuring each objective is six (the range is from [illegible]
to seven). Lindvall and Cox (1969) report that the length of a pretest is
determined by the number of objectives in the instructional unit and by the
number of items used to test each objective. No fixed number of items to
measure each objective is used because of the diverse nature of the
objectives. For example, they note that "an objective like -- the pupil
can solve simple addition problems involving all number combinations -- will
require more items than would an objective like -- the pupil must select
which of three triangles is equilateral --."

A student is prescribed instruction in each objective in the unit for
which he fails to achieve an 85% mastery level. In the case where the
student demonstrates mastery of each objective, he is moved on to the next
unit in his profile, where he again takes a pretest.

The unit posttests are simply alternate forms of the unit pretests and
are administered to students as they complete instruction on the unit. A
student receives a mastery score for each objective in the unit. He is
required to repeat instruction on any objective where he fails to achieve
an 85% mastery score. He is directed to the next unit in his profile if
he demonstrates mastery on each objective covered in the unit posttest.
Those who repeat instruction on one or more of the objectives must take the
unit posttest again before moving on in their program.

In summary, pretests and posttests are available for each unit of
instruction. The proper pretest is administered on the basis of the student's
curriculum profile, and learning tasks for each skill are assigned (or not
assigned) on the basis of a student's performance on items measuring the
skill.
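
The pretest and posttest decisions can be sketched as follows. The sketch assumes that a mastery score is the percentage of items measuring an objective that the student answers correctly and that the 85% level is applied to each objective separately, as described in this section; the function names, the objective labels, and the item responses are hypothetical.

```python
from collections import defaultdict

MASTERY_LEVEL = 85.0  # percent; the criterion cited in the text


def mastery_scores(item_results):
    """Percent correct per objective from (objective, is_correct) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for objective, is_correct in item_results:
        total[objective] += 1
        correct[objective] += int(is_correct)
    return {obj: 100.0 * correct[obj] / total[obj] for obj in total}


def objectives_to_prescribe(item_results, criterion=MASTERY_LEVEL):
    """Objectives whose mastery score falls below the criterion."""
    scores = mastery_scores(item_results)
    return sorted(obj for obj, pct in scores.items() if pct < criterion)


# Hypothetical pretest: three objectives measured by 6, 7, and 5 items.
pretest = (
    [("obj-1", True)] * 6
    + [("obj-2", True)] * 5 + [("obj-2", False)] * 2
    + [("obj-3", True)] * 4 + [("obj-3", False)] * 1
)
print(mastery_scores(pretest))
print(objectives_to_prescribe(pretest))
```

An empty list from `objectives_to_prescribe` corresponds to the case in which the student moves on to the next unit in his profile; on a posttest, a non-empty list corresponds to repeating instruction on the listed objectives and retaking the posttest.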

Compared with students in many other types of mathematics programs, 
it is clear that the student in the IPI program spends more of his time 
taking tests. However, to some extent this can be justified on the 
grounds that testing is an integral part of the learning process in the 
IPI model. Nevertheless, there seems to be good reason for researching 
techniques to reduce testing time. 



A mastery score on each objective for a student is calculated as the
percentage of items on the test that measure the objective that the student
answers correctly.
Hsu and Carlson (1972) point out several problems associated with the
current version of the unit pretests and posttests. The existing system 
requires that every objective be tested; hence, the time a student spends 
taking tests is considerable. Also, because of management and scoring 
problems, feedback to the student on his results is not immediate. Further, 
students are occasionally required to take the same posttest on a second 
occasion. This raises a question about practice effect. 

One very promising way to reduce the testing time, with the correlated
result of producing better instructional decisions, is suggested in the
branched testing work of Ferguson (1969, 1971). Ferguson showed that by
using a tailored testing strategy, a computer terminal to monitor the
selection of test items, and information on the hierarchical structure of the
items, he was able to significantly reduce unit testing time without any
loss in decision-making accuracy. A comprehensive review of the work in
branched testing is out of place here; suffice it to say that major
contributions to the area include Ferguson (1969, 1971) and Lord (1970).
A review of some of the work in the area is provided by Bock and Wood
(1971).

Curriculum-Embedded Tests

As the student proceeds through a unit of instruction, his progress
must be monitored. This is done by curriculum-embedded tests (CET).
As used in the mathematics IPI program, a CET is primarily a measure of
performance on one specific objective. There are usually several test items
to measure the objective. A review of the CETs in level E of the program
revealed that there are on the average about three items measuring the primary
objective covered in the CET. The range is from two to five. If a student
receives a score of at least 85%, he is permitted to move on to the next prescribed
objective. Otherwise, he is sent back for additional work and then he
takes an alternate form of the CET when he is ready.

A secondary purpose of the CET is to pretest, in a rough way, the next
objective in the learning sequence. (Objectives in a unit are arranged
into a learning sequence.) Students may pretest out of the next skill in
the sequence by achieving 85% or higher on the short test which makes up
the second part of the CET and on part one of the CET for that skill. It
would appear from a review of level E tests that there are about two items
measuring the secondary objective. In cases where a student does not need
instruction on the next skill, he can skip part two of the CET and move on
to the part two of the CET that tests the next skill he needs for his
program. This additional pretesting of an objective in the CET gives
students a chance to demonstrate mastery of new skills not specifically
covered in the instruction to that point and to eliminate that instruction
from his program.
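
The two CET decisions can be sketched as below. This is a minimal illustration that assumes the 85% criterion is applied to the percent-correct score on each part; the function names and action labels are hypothetical and are not taken from the IPI materials.

```python
CRITERION = 85.0  # percent


def percent_correct(n_correct: int, n_items: int) -> float:
    """Percent of items answered correctly on one part of a CET."""
    if n_items <= 0:
        raise ValueError("a test part needs at least one item")
    return 100.0 * n_correct / n_items


def part_one_decision(n_correct: int, n_items: int) -> str:
    """Decision on the primary objective measured by part one of the CET."""
    if percent_correct(n_correct, n_items) >= CRITERION:
        return "move to next prescribed objective"
    return "additional work, then alternate CET form"


def pretests_out_of_next_skill(part_two, next_part_one) -> bool:
    """True when both parts covering the next skill meet the criterion.

    Each argument is a (n_correct, n_items) tuple.
    """
    return all(
        percent_correct(c, n) >= CRITERION for c, n in (part_two, next_part_one)
    )


print(part_one_decision(3, 3))
print(part_one_decision(2, 3))
print(pretests_out_of_next_skill((2, 2), (3, 3)))
print(pretests_out_of_next_skill((2, 2), (2, 3)))
```

With the two to five items reported for the level E tests, an 85% criterion can be met only by answering every item correctly, since even four correct out of five is 80%.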

Student Diagnosis

Once the student has been assigned to a unit of instruction and the
objectives for which he needs instruction have been identified by the unit
pretest, there still remains the problem of deciding which of several
instructional methods is "optimal" for him. That is, of the available
instructional methods for a particular instructional unit, in which of them
would a student with a known background in the program and specific goals,
interests, and aptitudes stand the "best" chance of learning the material?
Glaser and Nitko (1971) call this a diagnostic decision.
III. Program for Learning in Accordance with Needs (PLAN)

3.1 Background 

Project PLAN is a major ungraded, computer-supported individualized
instruction program in education developed by the American Institutes for
Research over the last seven years. (For background, see Weisgerber, 1971.)
The project was initiated by John Flanagan to handle many of the shortcomings
of our educational system as revealed by Project TALENT (Flanagan
et al., 1964).

The PLAN program is currently being used in over 70 schools with more 
than 35,000 students in grades one through twelve. Instructional materials 
are available in four areas: social studies, language arts, mathematics, 
and science. Westinghouse Learning Corporation is now responsible for the
monitoring and marketing of Project PLAN materials. They also operate the
computer installation necessary for the proper functioning of Project PLAN
in a school.

Unfortunately, the implementation of the model in 1972-73 involves far
fewer features than were originally described by the proponents of the
program a few years ago. Nevertheless, we will describe the more elaborate
version of the program in this paper.

3.2 Instructional Paradigm 

The basic unit of instruction in PLAN, called a module, is an
instructional package made up of about five behavioral objectives. It normally
takes a student about two weeks to complete a module of instruction. Also,
there are many objectives classified at the higher levels of Bloom's (1956)
taxonomy that do not fit nicely into the regular modules. These are
named module-set objectives, and examples include concept development and
problem-solving skills. They are worked into the regular modules, and
progress is measured by PLAN achievement tests administered periodically
throughout the program. According to Rhetts (1970), there are more than 1100
modules in PLAN. For each module, there are several different
teacher-learning units (TLU) assigned individually on the basis of aptitudes,
interests, learning style, etc. All modules in the secondary school
curricula are coded as to whether they are 1) part of a state or local
requirement, 2) essential for a given educational or occupational area,
3) highly desirable for that area, 4) essential for minimum functioning
as a citizen, 5) highly desirable for all citizens to know, or 6) likely to
make the student a particularly well informed citizen.

TLU's are coded according to: 1) reading difficulty, 2) degree to which
the unit requires teacher supervision, 3) media richness, 4) degree to which
it requires social involvement and/or group learning activities, 5) the
amount of reading involved, and 6) variety of activities in the module.
There are, on the average, two TLU's for each module. Along the lines of
Dunn (1970), we will describe the most complex version of the
program: the version currently being used in the secondary school.

At the beginning of each year, a program of study is prepared for each
student. This includes a list of modules, suggested TLU's, and a recommended
sequence in the four content areas. To really provide individualized
instruction, it is necessary to know about student needs, goals, abilities,
and interests and to use the information in developing a program of study
(POS) for him. As part of the PLAN system, then, the following information
is collected:



1. parent and student educational goals 

2. parent and student vocational aspirations 

3. student level of achievement and vocational interests 

4. student abilities (such as reading comprehension and arithmetic
reasoning)

5. past performance of student in program 

6. student's learning style. 

A variety of questionnaires and testing instruments have been developed 
to collect the above information. 

Abilities are measured each year with the Developed Abilities
Performance Test (DAPT). This test consists of 18 scales (see, for example,
Jung, 1970) such as those to measure arithmetic reasoning, reading
comprehension, abstract reasoning, mechanical comprehension, and ingenuity.

On the basis of the above information, a program is developed and the
student is monitored through it by continuous module posttesting and PLAN 
achievement testing. Let us look now at the testing phase of the program 
in more detail. 

3.3 Testing Model Details

Within a PLAN school, there exists a multitude of decisions to make on
each student. These include development of a program of study, periodic
assessment of module-set objectives, performance on the modules of
instruction, assignment of TLU's, and yearly monitoring of important skills.
The major decision points are shown in Figure 3.3.1. Unfortunately, there
is little available information on how these decisions are made.

[Figure 3.3.1. Flowchart of steps in monitoring student progress in
Project PLAN. Boxes, in order: Developed Aptitude Performance Tests taken;
one module selected for study; module pretest (optional), with branches
"pass all skills" and "fail one or more skills"; prescription developed,
consisting of assigning a relevant TLU; module posttest taken, with
branches "pass all skills" and "fail one or more skills".]

Development of a Program of Study

On the basis of DAPT scores, which are matched to TALENT data of people
in different occupations, the students and parents select a long-range goal
(LRG), one of 12 families of occupations. Information on the long-range
goal, along with the parent and student information described in the last
section, is used to develop a program of study. The DAPT is also used in the
determination of the number of modules a student will study in a year. Jung
(1970) reports that on the basis of weights derived from regression
analyses, a quota is identified for each PLAN student in each subject area.
Modules are then assigned to him on the basis of his LRG group membership
until this quota is filled.

Developed Aptitude Performance Tests

These tests are given at the beginning of each school year.
Information on the length, kinds of test items, reliability, and validity does not
appear to have been published. Also, we do not know whether a different
version of the test is used in each year, or whether the same version is
used for several years. Regardless, unless comparability of the score
scales for the different versions has been carefully established, we doubt
whether the change scores (for individuals or groups) on each variable from
year to year have very much meaning.

PLAN Achievement Tests

Mastery of the module-set objectives is measured at specific points
in the curriculum using PLAN achievement tests. However, we are also
unclear on the make-up of the PLAN achievement tests. Apparently, they
are administered at "specified points" in the curriculum and the format of
these tests is sometimes something other than the paper and pencil variety.

Module Tests

When the student feels he has mastered the materials covered in a
module, he can take a criterion-referenced module posttest which has on
it several items measuring each objective in the module. The items are
usually presented in a selection format to facilitate computer scoring.
On the basis of his performance, the computer, using built-in decision rules,
makes one of four decisions. If he answers all items correctly, he is
given a "complete" on the module and the computer printout tells him where
to go next. If he makes a "few" errors, he is given a result of "Student
Review". The computer specifies his performance on each objective and
indicates the ones he should review before beginning his next module.

A student who misses a large number of items on the test but still scores
high enough to pass receives a result of "Teacher Certify". He is instructed
by the teacher on which objectives to review and/or restudy. He is not
given his next module until, in the judgment of the teacher, he has mastered
all of the objectives. An alternative is to have the student repeat the
module posttest. The fourth possibility is student failure to pass
the test. In this situation, he is instructed to restudy the module with
the same TLU or another. If he fails the test again, the
teacher intervenes and takes some appropriate action to clear up the problem.
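
The four outcomes described above can be sketched as a small decision
function. The actual PLAN decision rules are not given in the sources
reviewed here, so the two cutoffs (`few_errors` and `pass_fraction`) and
the label "Restudy" for the fourth outcome are hypothetical, chosen only to
make the branching concrete.

```python
def module_posttest_decision(num_items, num_correct,
                             few_errors=2, pass_fraction=0.7):
    """Return one of the four module posttest outcomes described in the text.

    few_errors and pass_fraction are hypothetical cutoffs; the real PLAN
    decision rules are not published in the sources reviewed here.
    """
    if not 0 <= num_correct <= num_items:
        raise ValueError("num_correct must lie between 0 and num_items")
    errors = num_items - num_correct
    if errors == 0:
        return "Complete"            # every item answered correctly
    if num_correct / num_items < pass_fraction:
        return "Restudy"             # failed: restudy with same or another TLU
    if errors <= few_errors:
        return "Student Review"      # a few errors: student reviews objectives
    return "Teacher Certify"         # passed, but missed many items


# Example: a 20-item posttest at four levels of performance.
for correct in (20, 19, 15, 10):
    print(correct, module_posttest_decision(20, correct))
```

With these illustrative cutoffs, scores of 20, 19, 15, and 10 out of 20 fall
into the four outcomes in the order the text presents them.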

Assignment to Instructional Modes

The basic problem was described in the discussion of the IPI program,
i.e., what particular instructional mode (or in this case, TLU) should the
student take to study the module so as to maximize his chances of learning
the material? Dunn (1970) notes "that the computer, from a complex
set of decision rules, matches the student with specific TLU's". We wonder
what those rules would be, particularly since there is no theory of
instruction to guide the development of optimal assignment rules. To this
point in time, educational psychologists have been able to find only a
handful of interactions between background variables and instructional
method. A partial answer is provided by Weisgerber and Rahmlow (1971). They
noted that teacher learning units are based upon different assumed learning
styles of students and are guided by a philosophy of education (Flanagan,
1970) and a theory of learning (Gagné, 1965).

IV. Mastery Learning

4.1 Background 

The mastery learning concept was introduced to American schools in
the 1920s with the work of Washburne (1922). However, because technology
was not developed to the point that the program could operate efficiently,
interest in the concept steadily diminished until it was revived in the
form of programmed instruction in the late 1950s. (Programmed
instruction was an attempt to provide students with instructional materials
that would allow them to move at their own pace and receive constant
feedback on their level of mastery.) The work by Carroll (1963, 1970)
and Bloom (1968) and Bloom's students (Block, 1971; Airasian, 1971; and
others) was instrumental in bringing mastery learning to the attention of
instructional designers and researchers.

Since Bloom's paper in 1968, a great deal of research has been conducted;
and the results suggest that the mastery learning model "can be easily and
inexpensively implemented at all levels of education and in subjects
ranging from arithmetic to philosophy to physics" (Block, 1970). The
model has been used now with more than 20,000 students.

4.2 Instructional Paradigm

This model is quite different from IPI and PLAN in that it attempts
to individualize instruction within a group-based instructional environment.
The curriculum is organized into units of instruction made up of a
collection of behavioral objectives, and for each unit one or more
criterion-referenced tests are used to measure mastery. Individualization is
handled via supplemental materials, feedback, and corrective techniques
applied to students who do poorly on the posttests.

Mayo (1970), in describing the mastery learning model, notes that:

1. Students are made aware of course and unit expectations, so that
they view learning as a cooperative rather than as a competitive
venture.

2. Standards of mastery are set in advance for the students, and
grading is in terms of absolute performance rather than relative
performance.

3. Short diagnostic tests are used at the end of each instructional
unit.

4. Additional learning is prescribed for those who do not demonstrate
unit mastery.

5. Additional time for learning is prescribed to students who seem
to need it.

The mastery learning model is less impressive in scope than PLAN, and
the requirements for an effective testing plan are less stringent than with
IPI or PLAN. Features of mastery learning appear to be that it is easily
implementable, does not require the use of a computer, and is appropriate
for almost any content area. Also, if mastery learning is carried out
properly, previous research suggests that students will achieve higher
scores and have more interest in school and a better attitude toward school.
Unlike the other two models, with mastery learning much of the work has been
on research related to the correctness of the model of school learning.
An extensive number of content areas have been studied.

It should be noted that there are many variations on the basic mastery
model as originally proposed by Bloom (1968). Some of them are summarized
by Block (1971), and an example would be the work of Kim (1971).

4.3 Test Model Details

Block (1971) notes that, "To individualize instruction within the
context of ordinary group-based instruction, mastery learning relies
heavily on the constant flow of feedback information to teacher and
learner." It does not seem, however, that there is as much testing in
mastery learning as in IPI or PLAN. A flow chart of the testing component
is shown in Figure 4.3.1.

The mastery learning testing model as described by Airasian (1971)
represents a special case of the IPI testing program. There is no
placement testing, and unit pretesting and curriculum-embedded testing are not
emphasized. Unit posttesting and final assessment represent the two major
kinds of testing in the program. In the spirit of Scriven (1967), these
two kinds of tests are known as formative and summative tests. It should be
noted, however, that formative tests, or unit posttests as they are called
in IPI, are not used for grading. They are used only for diagnosing
learning difficulties.

Formative Tests

A formative test is designed to cover the objectives over a short unit
of instruction in the mastery learning program. It is used to determine
whether or not a student has mastered the material and to serve as a basis
for prescribing supplemental work in areas where the student is weak
(Airasian, 1971). Implementers of the mastery learning model have set
the passing standard anywhere from 75% to 100%. There is no set number of
items or format suggested to measure each objective; however, there is a
suggestion that instructional decisions are made on the basis of responses
to individual items.

[Figure 4.3.1. Flowchart of steps in monitoring student progress in a
typical version of a mastery learning model. Boxes, in order: unit in
program selected for study; group-based instruction on the unit
objectives; unit posttest (formative test) taken. "Pass all skills" leads
to free study time, tutoring others, etc., and then to the question "Last
unit of instruction?" (No: return to unit selection; Yes: final
assessment, the summative test). "Fail one or more skills" leads to a
prescription developed for use of alternative resources.]

The formative tests in mastery learning represent the key to
individualizing instruction since it is on the basis of these scores that
individualization of instruction can take place. Units are kept small
so that unit testing takes place frequently to increase the effectiveness
of the individualization of instruction component of the program.
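
As a rough illustration of how a formative test might feed the corrective
step, the sketch below scores one student's responses, applies a passing
standard, and flags every objective on which an item was missed. The
function name, the data layout, and the default standard of 80% (one value
inside the 75% to 100% range reported above) are assumptions made for the
example, not features of any particular mastery learning implementation.

```python
def score_formative_test(responses, item_objectives, passing_standard=0.80):
    """Score a formative test and list objectives needing corrective work.

    responses: list of booleans, True where the item was answered correctly.
    item_objectives: list naming the objective that each item measures.
    passing_standard: hypothetical value within the 75%-100% range
    reported in the text.
    """
    if len(responses) != len(item_objectives) or not responses:
        raise ValueError("need one objective label per item, at least one item")
    proportion_correct = sum(responses) / len(responses)
    # Decisions are made from individual items: any objective with a
    # missed item is flagged for supplemental work.
    to_review = []
    for correct, objective in zip(responses, item_objectives):
        if not correct and objective not in to_review:
            to_review.append(objective)
    return {
        "proportion_correct": proportion_correct,
        "mastery": proportion_correct >= passing_standard,
        "objectives_to_review": to_review,
    }


result = score_formative_test(
    responses=[True, True, False, True, True],
    item_objectives=["A", "A", "B", "B", "C"],
)
print(result)
```

In this example the student reaches the 80% standard but is still pointed to
objective B, which mirrors the idea that the test is used for diagnosis
rather than for grading.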

Summative Tests

The primary purpose of the summative test in the mastery learning
model is to grade students on the basis of their achievement of course
objectives. The items in the test are keyed to objectives and representative of
the pool of course objectives. A criterion-referenced interpretation of
the scores is recommended. It is proposed that cutting points be located
on the ability continuum and grades should be assigned on the basis of a
student's position on the continuum and not relative to other students in
the course. A norm-referenced interpretation of the scores is also possible.

Final Comments

Mastery learning is probably the least different from traditional
instruction since the principal instruction is always group-based and
final grades are assigned. (However, it is expected that, because of various
features built into the program, the final assessment testing will not
be as threatening a situation for the student as it is in more traditional
programs.) Differences from traditional instructional models include
features such as individual pacing, and the big difference is the use of
frequent tests on small units of instruction to diagnose learning problems.
Important features are the feedback/correcting-review techniques. It
would appear, however, that there is little in the way of sophistication
concerning the testing model. For example, there appear to be no
guidelines for determining the optimum number of items to measure each
objective on a unit posttest. An exception is the excellent work of
Block (1970) in investigating, among other things, the problem of setting
cutting scores on criterion-referenced tests to separate students into
two groups, masters and non-masters. His results suggest that setting
cutting scores high (95%) may be best for cognitive learning but that in the
long run positive attitudes and interest in the subject are less likely
to develop. With a reduction in the cutting score to 85% there was a
reduction in cognitive learning, but selected affective outcomes were
maximized.



V. A Comparison of the Testing Models

5.1 Introduction

In the three previous sections we have highlighted the basic testing
and decision-making features in three individualized instructional
programs: IPI, PLAN, and Mastery Learning. IPI, PLAN, and ML are really
generic terms to represent three classes of individualized instructional
programs. (Incidentally, these represent only a small portion of the
possible classes described by Gibbons (1970), although the classes we
selected for study are among the most common and ones that require,
generally, more testing.) Within any particular class there is still
considerable variation among the various programs caused by local needs,
teacher preferences, and methods of implementation. We shall discuss
general features since they remain the same from program to program.

Within all three models, instruction is self-paced although mastery 
learning is somewhat more structured since the initial instruction on a 
unit is group-paced. With each of the models, the content is organized 
into units or modules. Generally, in IPI and ML the student is expected 
to demonstrate mastery on all the units before completing the program of 
study although by his performance on unit pretests, it is possible for him
to avoid instruction on any of the units. (One variation that does come
up is the availability of "enrichment materials" which are an optional
part of the curriculum.) In PLAN, at any grade level there are far more
units than any student could or would ever want to master. Thus, it is 
first of all necessary to define a content domain of study for each 
student. 



In the remainder of the section, we shall limit discussion to testing
and decision-making issues. In order to develop a framework for the
discussion, we have chosen to focus on the following issues:

1) selection of a program of study;

2) criterion-referenced testing on the unit objectives;

3) assignment of instructional modes;

4) final year-end assessment.

These represent the extent of the decision paradigms within the three 
models. The importance and sophistication used in handling each component 
varies from one model to another. 

5.2 A Compendium of Decision Paradigms

Selection of a Program of Study

A program of study is that collection of units which a curriculum 
designer deems necessary for the appropriate education of the student. 

All three models are designed for utilization with a curriculum
defined in terms of behavioral objectives arranged into blocks, units, or
modules around a common topic or theme. Generally in IPI and ML, students
are expected to demonstrate mastery of all of the available program
objectives. The starting assumption is that there exists a body of
knowledge in which the student needs to be able to demonstrate mastery.
This defines the program of study for the student. However, on the basis
of high pretest results students can avoid instruction on selected units.

In PLAN, each student receives a unique program of study. The more
advanced the students, the more varied their programs of study become.



For reasons described above, selecting a program of study for a
student in IPI or Mastery Learning is relatively easy. The decisions to be
made reduce, basically, to determining whether students have mastered
particular objectives. They will receive instruction only on objectives
they have not mastered. In IPI, placement tests are used to determine
the level of instruction in each area for the students. Here the problem
of giving the student credit for units he has not mastered (a false
positive error) seems to be somewhat more serious than mistakenly
assigning him to instruction he does not need (a false negative error). This
follows since a student has a second chance to demonstrate mastery of the
objectives in a unit through the unit pretest if he is mistakenly assigned
instruction on it. To be made exempt from instruction on a unit he has
not mastered, particularly if it is an important unit, will plague him in
his future studies.

In theory at least, developing a program of study in the PLAN program
is a complex affair. Done once a year, it requires the wealth of information
described in section 3.3. The danger of locating
a student in the wrong program because of misjudgment on the part of the
parents, teachers, or the student, or because of a "less than 100%
predictive system", is great; however, this is the same risk we take with
selection of a program in a traditional school. This is particularly serious
in the high school where there is more choice than in the elementary school
programs. However, the flexibility of the PLAN program makes switching from
one program to another easier.

Criterion-Referenced Testing on the Unit Objectives 

There are three kinds of testing appropriate here: unit pretesting,
unit posttesting, and curriculum-embedded testing. All three kinds of
testing are used in IPI and PLAN although unit pretesting is not stressed
in PLAN. The possibility exists for all three kinds of testing in
Mastery Learning; however, unit pretesting is not emphasized, and a student
can avoid the curriculum-embedded testing by passing the unit posttest and
thus avoiding the remedial instructional materials. It is possible that
curriculum-embedded tests are not available in the remedial materials
either.

Let us briefly look now at the losses involved in making different
kinds of decisions. It should be recalled that the unit tests (or module
tests) measure performance on each objective or skill with several items.
On the unit pretests, a student receiving credit for non-mastered objectives
will likely be "caught" on the administration of the posttest and corrective
instruction can be assigned. However, to the extent that these objectives
are a prerequisite to others in the unit there is a potential problem.
(Perhaps this is a place where Bayesian statistics might be helpful in
producing an improved profile of scores across objectives measured by the
unit pretest. This would undoubtedly improve the overall decision-making
accuracy. Likewise, this strategy could be used on the unit posttests.)

To assign instruction, on the basis of pretest score results, on
objectives that a student has already mastered will prove to be
frustrating to him; however, it should be noted that the majority of
errors of this type occur because students are close to the cutting score.

Receiving credit for non-mastered objectives on the posttest, to the
extent that the objectives are prerequisites to others in future units, will
interfere with the rate of learning at that point. This error seems to be
less serious in terms of program efficiency if the objectives are terminal.
Failing to receive credit for mastered objectives would seem to be less
serious since the student could move through the remedial materials quickly
and retake the test.

Since any decisions on the basis of curriculum-embedded test score
results affect the student for only a limited amount of time and there
exist checks on any decisions with the unit (or module) posttest, there
is little concern for developing more appropriate testing decision
guidelines at this level.

Assignment of Instructional Modes

An integral component of nearly every individualized instruction
program is the feature whereby there exist several alternate instructional
modes for the various units of instruction that can be assigned in some
optimal way to students. In theory anyway, with IPI and PLAN, past
performance and background aptitude variables are used to assist the students in
selecting the "best" mode of instruction. With Mastery Learning, this
feature can be operationalized following the group-based instruction and
the unit posttests. It is at this point that decisions on the proper
corrective feedback techniques to use need to be made.

Investigators of the possible interactions between instructional methods
and aptitudes are conducting what has been termed aptitude-treatment
interaction research (Cronbach, 1967). Disappointing is the fact that while
nearly all developers of individualized programs include this feature of
utilizing ATI results in assigning instruction, there are few real
demonstrations of significant interactions between aptitudes and instructional
modes (Bracht, 1970; Cronbach and Snow, 1969). Authors such as Glaser
(1972) have attempted to explain these results and suggest some new
directions. However, it would appear that we are far from a "theory of
instruction" to guide the instructional decision-making in assignment of
"optimal" instructional modes to students.

The benefits (assuming equal treatment costs) of the ATI
classification scheme for improving the quality of instruction depend directly on
the differences among the slopes of the regression lines for predicting
criterion scores with different aptitude variables in the different
instructional modes. The bigger the difference in slopes, the greater is the
potential benefit to the student of assigning one instructional mode or
another. However, in looking at the overall benefits and losses of such a
system, it would seem that the appropriate baseline for comparative purposes
would need to be data derived from a traditional instructional program.
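
The dependence of the benefit on the difference in slopes can be made
concrete with a minimal sketch. It assumes one aptitude variable, two
instructional modes, equal treatment costs, and regression lines whose
intercepts and slopes are invented for the example. As a simplification,
the baseline here is "everyone receives the single mode that is best on
average" rather than data from a traditional program.

```python
def predicted_score(line, aptitude):
    """Predicted criterion score from a regression line (intercept, slope)."""
    intercept, slope = line
    return intercept + slope * aptitude


def crossover_point(line_a, line_b):
    """Aptitude at which two modes predict the same criterion score.

    Returns None when the slopes are equal, i.e. there is no interaction.
    """
    (intercept_a, slope_a), (intercept_b, slope_b) = line_a, line_b
    if slope_a == slope_b:
        return None
    return (intercept_b - intercept_a) / (slope_a - slope_b)


def assign_mode(aptitude, lines):
    """Choose the mode with the highest predicted criterion score."""
    return max(lines, key=lambda mode: predicted_score(lines[mode], aptitude))


def mean_gain(aptitudes, lines):
    """Mean predicted gain of aptitude-based assignment over giving every
    student the single mode that is best on average (equal costs assumed)."""
    n = len(aptitudes)
    best_single = max(
        sum(predicted_score(line, x) for x in aptitudes) / n
        for line in lines.values()
    )
    assigned = sum(
        predicted_score(lines[assign_mode(x, lines)], x) for x in aptitudes
    ) / n
    return assigned - best_single


# Hypothetical regression lines, (intercept, slope), for two modes.
lines = {"mode_a": (40.0, 0.5), "mode_b": (55.0, 0.25)}
aptitudes = [20, 40, 80, 100]
print(crossover_point(lines["mode_a"], lines["mode_b"]))
print(mean_gain(aptitudes, lines))
```

With these made-up lines the two modes cross at an aptitude score of 60, and
assignment by aptitude raises the mean predicted score by 3.75 points. If
the slopes are made equal, `crossover_point` returns None and the gain is
zero, which is the situation most ATI studies have reported.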

Final Year-End Assessment

This particular feature seems to be handled in much the same way in
IPI and PLAN. Information is reported on the number and nature of units
that a student has mastered. Little or no information is provided by the
school to students and parents that could be used for norm-referenced
assessment. In the mastery learning model, a score is reported to measure
achievement on the year-long activities. Both norm-referenced and
criterion-referenced interpretations are possible.

VI. Some Directions for Further Research

6.1 Concluding Remarks

A review of IPI, PLAN, and mastery learning programs as well as many
other objective-based curriculum programs not reported in this paper
reveals that there are many important questions remaining to be answered
in regard to individual assessment models. In this concluding section
a few of the more important problem areas are discussed.

In order to develop an instructional model that is sensitive to
individual needs, abilities, interests, and goals in a way that will
allow the student to maximize his learning, we need a theory of instruction.
A theory of instruction should set down rules on the most efficient way
of achieving knowledge (Bruner, 1964). This theory would provide guidelines
on how to prescribe instruction to increase learning. One paper that
addresses the problem is Groen and Atkinson (1966). Current reports on the
related topic of aptitude-treatment interactions are by Cronbach and Gleser
(1965), Cronbach and Snow (1969), Bracht (1970), and Glaser (1972).

In making decisions on the basis of criterion-referenced test scores
one assumes a good match between items and the behavioral objectives
they are intended to measure. To the extent that test items do not
accurately measure the objectives, any decisions based on test performance
will be inaccurate. To date a satisfactory methodology for item validation
does not exist although several useful papers provide partial solutions
(Dahl, 1971; Rovinelli and Hambleton, 1973).

A theory of criterion-referenced tests and measurements is also
needed to guide the users of the tests in the context of programs
described here. This theory should probably be based on a threshold loss
function rather than a squared-error loss function as has been done in
classical test theory (Lord and Novick, 1968; Hambleton and Novick, 1973).
This theory would include reliability, validity, test scoring and item
analysis procedures for criterion-referenced tests. It would also provide
guidelines and techniques for setting cutting scores and allocating testing
time.
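
The contrast between the two loss functions can be illustrated with a
short sketch. The function names and the two loss values are hypothetical;
the false positive is weighted more heavily only to echo the earlier
discussion of placement errors, not because the source prescribes a ratio.

```python
def threshold_loss(true_level, advance, cutoff,
                   false_positive_loss=2.0, false_negative_loss=1.0):
    """Loss for a mastery decision under a threshold loss function.

    The two loss values are hypothetical illustrations.
    """
    is_master = true_level >= cutoff
    if advance and not is_master:
        return false_positive_loss   # advanced a non-master
    if not advance and is_master:
        return false_negative_loss   # held back a master
    return 0.0                       # correct decision, whatever the margin


def squared_error_loss(true_level, estimate):
    """Loss for estimating the true level, as in classical test theory."""
    return (estimate - true_level) ** 2


cutoff = 0.80
true_level = 0.90                    # a true master
for estimate in (0.81, 0.79):
    advance = estimate >= cutoff
    print(estimate,
          round(squared_error_loss(true_level, estimate), 4),
          threshold_loss(true_level, advance, cutoff))
```

The two estimates differ only slightly in squared error (0.0081 versus
0.0121), yet one leads to a correct decision with no loss and the other to a
false negative. Under threshold loss, only the side of the cutting score
matters, which is why it suits mastery decisions better than a measure of
estimation error.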

Another problem which has to be reckoned with for criterion-referenced
tests is an instance of the bandwidth-fidelity issue (Cronbach and Gleser,
1965). When the total testing time is fixed and there is interest in
measuring many competencies, one may be faced with the problem of whether
to obtain very precise information about a small number of skills or less
precise information about many more skills. Time allocation algorithms
(analytical procedures for deciding how many items on a test should measure
each objective) of a rather different kind than those presented by Woodbury
and Novick (1968) and Jackson and Novick (1970) will be required. The
problem of how to determine the number of items to measure each skill so
as to maximize the percentage of correct decisions or some similar measure
of overall decision-making accuracy on the basis of test results has yet to
be resolved.
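
To show what "percentage of correct decisions" could mean in practice, the
sketch below computes the probability of a correct mastery classification
as a function of the number of items, and then compares two ways of
spending a fixed budget of 20 items. It assumes a simple binomial error
model and an invented set of true skill levels; it is not one of the
allocation procedures cited above.

```python
from math import ceil, comb


def prob_correct_decision(true_p, n_items, cutoff):
    """Probability that an n-item test classifies the examinee correctly.

    Assumes a binomial error model: each item is answered correctly with
    probability true_p, independently of the others.
    """
    # Smallest number correct that counts as mastery; the small epsilon
    # guards against floating-point noise in cutoff * n_items.
    min_correct = ceil(cutoff * n_items - 1e-9)
    p_pass = sum(
        comb(n_items, k) * true_p**k * (1 - true_p) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )
    return p_pass if true_p >= cutoff else 1.0 - p_pass


def mean_accuracy(true_levels, items_per_skill, cutoff):
    """Average probability of a correct decision across the skills tested."""
    return sum(
        prob_correct_decision(p, items_per_skill, cutoff) for p in true_levels
    ) / len(true_levels)


# Hypothetical examinee: true proportion-correct on each of five skills.
true_levels = [0.95, 0.90, 0.60, 0.85, 0.50]
cutoff = 0.80
total_items = 20
for n_skills in (2, 5):
    items_per_skill = total_items // n_skills
    accuracy = mean_accuracy(true_levels[:n_skills], items_per_skill, cutoff)
    print(n_skills, items_per_skill, round(accuracy, 3))
```

Testing two skills with ten items each gives more dependable decisions on
fewer skills; testing five skills with four items each gives broader but
noisier information. A full allocation procedure would also have to weigh
the losses attached to each kind of error.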

Estimation of mastery is a problem that is encountered frequently
in the objective-based program. Bayesian methods have been suggested
(Hambleton and Novick, 1973), but there have been no empirical demonstrations
of their usefulness in this context, nor are guidelines for the use of
Bayesian methods available at the present time. Prior information for
a Bayesian solution might be mastery scores on other skills covered on
the test or performance on skills measured previously. (In the case
of posttesting, pretest information could be used as the prior.) Also,
just as data from other examinees can improve the precision of estimation
of achievement in a norm-referenced testing situation for an individual
(Lord and Novick, 1968), so perhaps the same can be done with
criterion-referenced measurement problems.
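
One simple way such a Bayesian estimate could be set up is sketched below,
using a beta prior on the student's true proportion-correct and a binomial
likelihood for the test items. This is an illustrative model chosen for its
simplicity, not the procedure of Hambleton and Novick (1973). Treating the
pretest counts as the prior is also a simplification, since it ignores any
learning between pretest and posttest.

```python
from math import exp, lgamma, log


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p, for 0 < p < 1."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(p) + (b - 1) * log(1 - p))


def prob_mastery(num_correct, num_items, cutoff,
                 prior_a=1.0, prior_b=1.0, steps=20000):
    """Posterior probability that the true proportion-correct is >= cutoff.

    A Beta(prior_a, prior_b) prior with a binomial likelihood gives a
    Beta(prior_a + correct, prior_b + incorrect) posterior. The tail area
    is found by numerical integration so that only the standard library
    is needed.
    """
    a = prior_a + num_correct
    b = prior_b + (num_items - num_correct)
    width = (1.0 - cutoff) / steps
    # The midpoint rule keeps p strictly inside (0, 1), so log() is defined.
    return width * sum(
        beta_pdf(cutoff + (i + 0.5) * width, a, b) for i in range(steps)
    )


# Posttest: 4 of 5 items correct, judged against an 80% cutting score.
print(prob_mastery(4, 5, 0.80))                           # uniform prior
# Same posttest, with a hypothetical pretest of 3 right and 2 wrong folded
# into the prior.
print(prob_mastery(4, 5, 0.80, prior_a=1 + 3, prior_b=1 + 2))
```

The posterior probability, rather than the raw score, would then be compared
with whatever level of confidence the program requires before a student is
declared a master.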

Within many objective-based programs the strategy of branched testing
would seem to be an appropriate technique, at least in situations where
the objectives in a content area can be arranged into hierarchical
sequences. Some of the practical problems have been resolved in the
Pittsburgh IPI Program so that the technique can now be used on a limited
basis. Nevertheless, many problems remain before adoption should or can
proceed with other programs. For example, it would be necessary to develop
a non-automated modified version of branched testing for schools without
computers. Also, we need to learn much more about setting starting places,
step sizes, stopping rules, etc., before we can effectively use branched
testing in an instructional setting.
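
The role of starting places, step sizes, and stopping rules can be seen in
a minimal sketch of branched testing over a strictly linear hierarchy. It
uses ordinary binary search, which is a stand-in technique and not the
Pittsburgh IPI procedure. It also assumes that each short test is
error-free and that mastering an objective implies mastery of every
objective below it. The `passes` callback is a hypothetical placeholder for
administering a test.

```python
def branched_placement(num_objectives, passes):
    """Locate the first unmastered objective in a linear hierarchy.

    passes(k) -> bool stands for administering a short test on objective k
    (0-based, ordered from easiest to hardest). Returns the index of the
    first unmastered objective and the list of objectives actually tested.
    """
    low, high = 0, num_objectives   # the answer lies somewhere in [low, high]
    tested = []
    while low < high:               # stopping rule: interval has closed
        mid = (low + high) // 2     # starting place and step: the midpoint
        tested.append(mid)
        if passes(mid):
            low = mid + 1           # mastered: branch to harder objectives
        else:
            high = mid              # not mastered: branch to easier ones
    return low, tested


# Hypothetical student who has mastered objectives 0 through 5 of 10.
placement, tested = branched_placement(10, lambda k: k < 6)
print(placement, tested)
```

For this student the sketch tests objectives 5, 8, 7, and 6 and places him at
objective 6, so four short tests replace ten. Because the rule is only a
halving of the remaining range, a teacher could follow it from a printed
chart, which bears on the need for a non-automated version.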